PolyA-iEP: A data mining method for the effective prediction of polyadenylation sites

Authors

  • George Tzanis
  • Ioannis Kavakiotis
  • Ioannis P. Vlahavas
Abstract

This paper presents a study of polyadenylation site prediction, an important problem in bioinformatics and medicine with particular promise for cancer research. We describe PolyA-iEP, a method we developed for predicting polyadenylation sites, and use it in a systematic study of the problem of recognizing mRNA 3′ ends that contain a polyadenylation site. PolyA-iEP is a modular system consisting of two main components, both of which contribute substantially to its descriptive and predictive potential. Specifically, PolyA-iEP exploits the advantages of emerging patterns, namely high understandability and discriminating power, together with the strength of a distance-based scoring method that we propose. The extracted emerging patterns may span many elements around the polyadenylation site and can provide novel and interesting biological insights. The outputs of the two components are finally combined by a classifier in a highly effective framework, which in our setup reaches 93.7% sensitivity and 88.2% specificity. PolyA-iEP can be parameterized and used for both descriptive and predictive analysis. We evaluated our method on Arabidopsis thaliana sequences and drew important conclusions.
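The abstract does not say how PolyA-iEP mines its emerging patterns, so the following is only a minimal sketch of the general concept, not the authors' implementation. It assumes the standard definition of an emerging pattern as an itemset whose support grows by at least a chosen ratio from a background set to a target set. The choice of 3-mers as items, the function names, the `min_growth` threshold, and the toy sequences are all hypothetical and chosen for illustration.

```python
from itertools import combinations
from math import inf


def kmers(seq, k=3):
    """Return the set of k-mers in a sequence (the 'items' of this sketch)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def support(pattern, transactions):
    """Fraction of transactions that contain every item of the pattern."""
    if not transactions:
        return 0.0
    return sum(pattern <= t for t in transactions) / len(transactions)


def growth_rate(pattern, background, target):
    """Support in target divided by support in background (inf if the
    pattern never occurs in the background, 0 if absent from the target)."""
    s_bg = support(pattern, background)
    s_tg = support(pattern, target)
    if s_tg == 0:
        return 0.0
    return inf if s_bg == 0 else s_tg / s_bg


def emerging_patterns(background, target, min_growth=2.0, max_size=2):
    """Brute-force search over small itemsets drawn from the target set."""
    items = sorted(set().union(*target))
    found = {}
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            pattern = frozenset(combo)
            rate = growth_rate(pattern, background, target)
            if rate >= min_growth:
                found[pattern] = rate
    return found


# Toy, made-up sequences (NOT real Arabidopsis thaliana data).
positives = [kmers(s) for s in ["AATAAATGT", "TTAATAAAC", "GAATAAAGG"]]
negatives = [kmers(s) for s in ["GCGCGCATA", "CCGGAATCC", "GGCCGGCCA"]]

eps = emerging_patterns(negatives, positives)
for pattern, rate in sorted(eps.items(), key=lambda kv: -kv[1])[:5]:
    print(sorted(pattern), rate)
```

The brute-force enumeration is only workable for tiny item sets like this one. In a combined system of the kind the abstract describes, the matched patterns would be turned into features and passed, together with the distance-based score, to a classifier; that step is not sketched here because the abstract gives no details about it.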

Similar resources

Genome-wide identification and predictive modeling of tissue-specific alternative polyadenylation

MOTIVATION Pre-mRNA cleavage and polyadenylation are essential steps for 3'-end maturation and subsequent stability and degradation of mRNAs. This process is highly controlled by cis-regulatory elements surrounding the cleavage/polyadenylation sites (polyA sites), which are frequently constrained by sequence content and position. More than 50% of human transcripts have multiple functional polyA...

Transcriptional activity regulates alternative cleavage and polyadenylation

Genes containing multiple pre-mRNA cleavage and polyadenylation sites, or polyA sites, express mRNA isoforms with variable 3' untranslated regions (UTRs). By systematic analysis of human and mouse transcriptomes, we found that short 3'UTR isoforms are relatively more abundant when genes are highly expressed whereas long 3'UTR isoforms are relatively more abundant when genes are lowly expressed....

Budding yeast telomerase RNA transcription termination is dictated by the Nrd1/Nab3 non-coding RNA termination pathway

The RNA component of budding yeast telomerase (Tlc1) occurs in two forms, a non-polyadenylated form found in functional telomerase and a rare polyadenylated version with unknown function. Previous work suggested that the functional Tlc1 polyA- RNA is processed from the polyA+ form, but the mechanisms regulating its transcription termination and 3'-end formation remained unclear. Here we examine...

A Robust Methodology for Prediction of DT Wireline Log

DT log is one of the most frequently used wireline logs to determine compression wave velocity. This log is commonly used to gain insight into the elastic and petrophysical parameters of reservoir rocks. Acquisition of DT log is, however, a very expensive and time consuming task. Thus prediction of this log by any means can be a great help by decreasing the amount of money that needs to be allo...

Pilot genome-wide study of tandem 3′ UTRs in esophageal cancer using high-throughput sequencing

Regulatory regions within the 3' untranslated region (UTR) influence polyadenylation (polyA), translation efficiency, localization and stability of mRNA. Alternative polyA (APA) has been considered to have a key role in gene regulation since 2008. Esophageal carcinoma is the eighth most common type of cancer worldwide. The association between polyA and disease highlights the requirement for com...

Journal:
  • Expert Syst. Appl.

Volume 38

Publication date: 2011